{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Convolutional Neural Networks\n",
    "---\n",
    "In this notebook, we train a **CNN** to classify images from the CIFAR-10 database.\n",
    "\n",
    "The images in this database are small color images that fall into one of ten classes; some example images are pictured below.\n",
    "\n",
    "<img src='notebook_ims/cifar_data.png' width=50% height=50% />"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test for [CUDA](http://pytorch.org/docs/stable/cuda.html)\n",
    "\n",
    "Since these are larger (32x32x3) images, it may prove useful to speed up your training time by using a GPU. CUDA is a parallel computing platform and CUDA Tensors are the same as typical Tensors, only they utilize GPU's for computation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import libraries\n",
    "import torch\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "# PyTorch dataset\n",
    "from torchvision import datasets\n",
    "import torchvision.transforms as transforms\n",
    "from torch.utils.data.sampler import SubsetRandomSampler\n",
    "\n",
    "# PyTorch model\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "import torch.optim as optim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check if CUDA is available\n",
    "train_on_gpu = torch.cuda.is_available()\n",
    "\n",
    "if not train_on_gpu:\n",
    "    print('CUDA is not available.  Training on CPU ...')\n",
    "else:\n",
    "    print('CUDA is available!  Training on GPU ...')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Load the [Data](https://pytorch.org/vision/stable/datasets.html)\n",
    "\n",
    "Downloading may take a minute. We load in the training and test data, split the training data into a training and validation set, then create DataLoaders for each of these sets of data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# number of subprocesses to use for data loading\n",
    "num_workers = 0\n",
    "# how many samples per batch to load\n",
    "batch_size = 20\n",
    "# percentage of training set to use as validation\n",
    "valid_size = 0.2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data transform to convert data to a tensor and apply normalization\n",
    "\n",
    "# augment train and validation dataset with RandomHorizontalFlip and RandomRotation\n",
    "train_transform = transforms.Compose([\n",
    "    # TODO transform data with RandomHorizontalFlip()\n",
    "    # TODO transform data with RandomRotation()\n",
    "    transforms.ToTensor(),\n",
    "    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))\n",
    "    ])\n",
    "\n",
    "test_transform = transforms.Compose([\n",
    "    transforms.ToTensor(),\n",
    "    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))\n",
    "    ])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# choose the training and test datasets\n",
    "train_data = datasets.CIFAR10('data', train=True,\n",
    "                              download=True, transform=train_transform)\n",
    "test_data = datasets.CIFAR10('data', train=False,\n",
    "                             download=True, transform=test_transform)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# obtain training indices that will be used for validation\n",
    "num_train = len(train_data)\n",
    "indices = list(range(num_train))\n",
    "np.random.shuffle(indices)\n",
    "split = int(np.floor(valid_size * num_train))\n",
    "train_idx, valid_idx = indices[split:], indices[:split]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define samplers for obtaining training and validation batches\n",
    "train_sampler = SubsetRandomSampler(train_idx)\n",
    "valid_sampler = SubsetRandomSampler(valid_idx)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# prepare data loaders (combine dataset and sampler)\n",
    "train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,\n",
    "    sampler=train_sampler, num_workers=num_workers)\n",
    "valid_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, \n",
    "    sampler=valid_sampler, num_workers=num_workers)\n",
    "test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, \n",
    "    num_workers=num_workers)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# specify the image classes\n",
    "classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
    "classes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualize a Batch of Training Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper function to un-normalize and display an image\n",
    "def imshow(img):\n",
    "    img = img / 2 + 0.5  # unnormalize\n",
    "    plt.imshow(np.transpose(img, (1, 2, 0)))  # convert from Tensor image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# obtain one batch of training images\n",
    "dataiter = iter(train_loader)\n",
    "images, labels = dataiter.next()\n",
    "images = images.numpy() # convert images to numpy for display\n",
    "images.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot the images in the batch, along with the corresponding labels\n",
    "fig = plt.figure(figsize=(25, 4))\n",
    "# display 20 images\n",
    "for idx in np.arange(20):\n",
    "    ax = fig.add_subplot(2, 20/2, idx+1, xticks=[], yticks=[])\n",
    "    imshow(images[idx])\n",
    "    ax.set_title(classes[labels[idx]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### View an Image in More Detail\n",
    "\n",
    "Here, we look at the normalized red, green, and blue (RGB) color channels as three separate, grayscale intensity images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rgb_img = np.squeeze(images[3])\n",
    "channels = ['red channel', 'green channel', 'blue channel']\n",
    "\n",
    "fig = plt.figure(figsize = (36, 36)) \n",
    "for idx in np.arange(rgb_img.shape[0]):\n",
    "    ax = fig.add_subplot(1, 3, idx + 1)\n",
    "    img = rgb_img[idx]\n",
    "    ax.imshow(img, cmap='gray')\n",
    "    ax.set_title(channels[idx])\n",
    "    width, height = img.shape\n",
    "    thresh = img.max()/2.5\n",
    "    for x in range(width):\n",
    "        for y in range(height):\n",
    "            val = round(img[x][y],2) if img[x][y] !=0 else 0\n",
    "            ax.annotate(str(val), xy=(y,x),\n",
    "                    horizontalalignment='center',\n",
    "                    verticalalignment='center', size=8,\n",
    "                    color='white' if img[x][y]<thresh else 'black')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Define the Network [Architecture](http://pytorch.org/docs/stable/nn.html)\n",
    "\n",
    "This time, you'll define a CNN architecture. Instead of an MLP, which used linear, fully-connected layers, you'll use the following:\n",
    "* [Convolutional layers](https://pytorch.org/docs/stable/nn.html#convolution-layers), which can be thought of as stack of filtered images.\n",
    "* [Maxpooling layers](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d), which reduce the x-y size of an input, keeping only the most _active_ pixels from the previous layer.\n",
    "* The usual Linear + Dropout layers to avoid overfitting and produce a 10-dim output.\n",
    "\n",
    "A network with 2 convolutional layers is shown in the image below and in the code, and you've been given starter code with one convolutional and one maxpooling layer.\n",
    "\n",
    "<img src='notebook_ims/2_layer_conv.png' height=50% width=50% />\n",
    "\n",
    "#### TODO: Define a model with multiple convolutional layers, and define the feedforward metwork behavior.\n",
    "\n",
    "The more convolutional layers you include, the more complex patterns in color and shape a model can detect. It's suggested that your final model include 2 or 3 convolutional layers as well as linear layers + dropout in between to avoid overfitting. \n",
    "\n",
    "It's good practice to look at existing research and implementations of related models as a starting point for defining your own models. You may find it useful to look at [this PyTorch classification example](https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py) or [this Keras example](https://keras.io/examples/vision/mnist_convnet/) to help decide on a final structure.\n",
    "\n",
    "#### Output volume for a convolutional layer\n",
    "\n",
    "To compute the output size of a given convolutional layer we can perform the following calculation (taken from [Stanford's cs231n course](http://cs231n.github.io/convolutional-networks/#layers)):\n",
    "> We can compute the spatial size of the output volume as a function of the input volume size (W), the kernel/filter size (F), the stride with which they are applied (S), and the amount of zero padding used (P) on the border. The correct formula for calculating how many neurons define the output_W is given by `(W−F+2P)/S+1`. \n",
    "\n",
    "For example for a 7x7 input and a 3x3 filter with stride 1 and pad 0 we would get a 5x5 output. With stride 2 we would get a 3x3 output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the CNN architecture\n",
    "class Net(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(Net, self).__init__()\n",
    "        # TODO build convolutional neural network with max pooling layers\n",
    "\n",
    "    def forward(self, x):\n",
    "        # TODO add sequence of convolutional and max pooling layers\n",
    "        \n",
    "        return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create a complete CNN\n",
    "model = Net()\n",
    "model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# move tensors to GPU if CUDA is available\n",
    "if train_on_gpu:\n",
    "    model.cuda()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Specify [Loss Function](http://pytorch.org/docs/stable/nn.html#loss-functions) and [Optimizer](http://pytorch.org/docs/stable/optim.html)\n",
    "\n",
    "Decide on a loss and optimization function that is best suited for this classification task. The linked code examples from above, may be a good starting point; [this PyTorch classification example](https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py) or [this, more complex Keras example](https://onlytojay.medium.com/mnist-cnn-optimizer-comparison-with-tensorflow-keras-163735862ecd). Pay close attention to the value for **learning rate** as this value determines how your model converges to a small error.\n",
    "\n",
    "#### TODO: Define the loss and optimizer and see how these choices change the loss over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# specify loss function (categorical cross-entropy)\n",
    "criterion = # TODO define categorical cross-entropy loss function\n",
    "\n",
    "# specify optimizer\n",
    "optimizer = # TODO define the optimizer with learning rate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Train the Network\n",
    "\n",
    "Remember to look at how the training and validation loss decreases over time; if the validation loss ever increases it indicates possible overfitting. (In fact, in the below example, we could have stopped around epoch 33 or so!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# number of epochs to train the model\n",
    "n_epochs = 30\n",
    "\n",
    "valid_loss_min = np.Inf # track change in validation loss\n",
    "\n",
    "for epoch in range(1, n_epochs+1):\n",
    "\n",
    "    # keep track of training and validation loss\n",
    "    train_loss = 0.0\n",
    "    valid_loss = 0.0\n",
    "    \n",
    "    ###################\n",
    "    # train the model #\n",
    "    ###################\n",
    "    model.train()\n",
    "    for data, target in train_loader:\n",
    "        # move tensors to GPU if CUDA is available\n",
    "        if train_on_gpu:\n",
    "            data, target = data.cuda(), target.cuda()\n",
    "        # clear the gradients of all optimized variables\n",
    "        optimizer.zero_grad()\n",
    "        # forward pass: compute predicted outputs by passing inputs to the model\n",
    "        output = model(data)\n",
    "        # calculate the batch loss\n",
    "        loss = criterion(output, target)\n",
    "        # backward pass: compute gradient of the loss with respect to model parameters\n",
    "        loss.backward()\n",
    "        # perform a single optimization step (parameter update)\n",
    "        optimizer.step()\n",
    "        # update training loss\n",
    "        train_loss += loss.item()*data.size(0)\n",
    "        \n",
    "    ######################    \n",
    "    # validate the model #\n",
    "    ######################\n",
    "    model.eval()\n",
    "    for data, target in valid_loader:\n",
    "        # move tensors to GPU if CUDA is available\n",
    "        if train_on_gpu:\n",
    "            data, target = data.cuda(), target.cuda()\n",
    "        # forward pass: compute predicted outputs by passing inputs to the model\n",
    "        output = model(data)\n",
    "        # calculate the batch loss\n",
    "        loss = criterion(output, target)\n",
    "        # update average validation loss \n",
    "        valid_loss += loss.item()*data.size(0)\n",
    "    \n",
    "    # calculate average losses\n",
    "    train_loss = train_loss/len(train_loader.sampler)\n",
    "    valid_loss = valid_loss/len(valid_loader.sampler)\n",
    "        \n",
    "    # print training/validation statistics \n",
    "    print('Epoch: {} \\tTraining Loss: {:.6f} \\tValidation Loss: {:.6f}'.format(\n",
    "        epoch, train_loss, valid_loss))\n",
    "    \n",
    "    # save model if validation loss has decreased\n",
    "    if valid_loss <= valid_loss_min:\n",
    "        print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(\n",
    "        valid_loss_min,\n",
    "        valid_loss))\n",
    "        torch.save(model.state_dict(), 'model_cifar.pt')\n",
    "        valid_loss_min = valid_loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "###  Load the Model with the Lowest Validation Loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.load_state_dict(torch.load('model_cifar.pt'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Test the Trained Network\n",
    "\n",
    "Test your trained model on previously unseen data! A \"good\" result will be a CNN that gets around 70% (or more, try your best!) accuracy on these test images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# track test loss\n",
    "test_loss = 0.0\n",
    "class_correct = list(0. for i in range(10))\n",
    "class_total = list(0. for i in range(10))\n",
    "\n",
    "model.eval()\n",
    "# iterate over test data\n",
    "for data, target in test_loader:\n",
    "    # move tensors to GPU if CUDA is available\n",
    "    if train_on_gpu:\n",
    "        data, target = data.cuda(), target.cuda()\n",
    "    # forward pass: compute predicted outputs by passing inputs to the model\n",
    "    output = model(data)\n",
    "    # calculate the batch loss\n",
    "    loss = criterion(output, target)\n",
    "    # update test loss \n",
    "    test_loss += loss.item()*data.size(0)\n",
    "    # convert output probabilities to predicted class\n",
    "    _, pred = torch.max(output, 1)    \n",
    "    # compare predictions to true label\n",
    "    correct_tensor = pred.eq(target.data.view_as(pred))\n",
    "    correct = np.squeeze(correct_tensor.numpy()) if not train_on_gpu else np.squeeze(correct_tensor.cpu().numpy())\n",
    "    # calculate test accuracy for each object class\n",
    "    for i in range(batch_size):\n",
    "        label = target.data[i]\n",
    "        class_correct[label] += correct[i].item()\n",
    "        class_total[label] += 1\n",
    "\n",
    "# average test loss\n",
    "test_loss = test_loss/len(test_loader.dataset)\n",
    "print('Test Loss: {:.6f}\\n'.format(test_loss))\n",
    "\n",
    "for i in range(10):\n",
    "    if class_total[i] > 0:\n",
    "        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (\n",
    "            classes[i], 100 * class_correct[i] / class_total[i],\n",
    "            np.sum(class_correct[i]), np.sum(class_total[i])))\n",
    "    else:\n",
    "        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))\n",
    "\n",
    "print('\\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (\n",
    "    100. * np.sum(class_correct) / np.sum(class_total),\n",
    "    np.sum(class_correct), np.sum(class_total)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Question: What are your model's weaknesses and how might they be improved?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Answer**: # TODO add your answer here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Visualize Sample Test Results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# obtain one batch of test images\n",
    "dataiter = iter(test_loader)\n",
    "images, labels = dataiter.next()\n",
    "images.numpy()\n",
    "\n",
    "# move model inputs to cuda, if GPU available\n",
    "if train_on_gpu:\n",
    "    images = images.cuda()\n",
    "\n",
    "# get sample outputs\n",
    "output = model(images)\n",
    "# convert output probabilities to predicted class\n",
    "_, preds_tensor = torch.max(output, 1)\n",
    "preds = np.squeeze(preds_tensor.numpy()) if not train_on_gpu else np.squeeze(preds_tensor.cpu().numpy())\n",
    "\n",
    "# plot the images in the batch, along with predicted and true labels\n",
    "fig = plt.figure(figsize=(25, 4))\n",
    "for idx in np.arange(20):\n",
    "    ax = fig.add_subplot(2, 20/2, idx+1, xticks=[], yticks=[])\n",
    "    imshow(images[idx] if not train_on_gpu else images[idx].cpu())\n",
    "    ax.set_title(\"{} ({})\".format(classes[preds[idx]], classes[labels[idx]]),\n",
    "                 color=(\"green\" if preds[idx]==labels[idx].item() else \"red\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
